My Corpus choice
My corpus consists of a collection of songs from three main genres: Pop, Rap, and Phonk (a subgenre of hip hop and trap music). The genres are split into three playlists to make comparisons easier. Something interesting to keep in mind is that I listen to these genres under different circumstances. I mostly listen to Pop when studying, to Rap when working out because it gives me that extra energy, and to Phonk when gaming because it helps me focus. Also note that the Rap and Phonk playlists come from Spotify’s catalog, while I made the Pop playlist myself. The Pop playlist is really a mix of popular songs I have gathered over the years, but since most of them are classified as Pop, I will simply refer to it as the Pop playlist. It will be interesting to see whether some songs in this Pop playlist would fit better in one of the other two playlists according to the features from Spotify’s API. I chose this corpus because I want to explore the differences and similarities among these genres in terms of their musical characteristics and find out which songs typically fit into each category. At first sight the three genres seem very different, so I thought it would be interesting to see whether they share specific musical characteristics that make them all so compelling for my taste in music.
Comparisons
The natural comparison points in my corpus are therefore the three genres: Pop, Rap, and Phonk. I expected to see differences in terms of rhythm, melody, harmony, and lyrics. For example, I expected Pop music to have a more upbeat rhythm and catchy lyrics, while Rap music might have a heavier beat and a very different lyrical flow. However, I also expected to see some similarities across genres. Rap and Phonk, for example, often seem to have a similar type of beat and rhythm: heavy and fast paced. It will be interesting to explore the similarities and differences in the instruments and musical keys used in typical songs of each genre.
The tracks in my corpus are representative of the groups I want to compare, as I selected them based on their popularity and mainstream recognition within each genre. However, it is possible that my corpus might not cover all sub-genres within each main genre, and therefore may not represent the full scope of each genre.
Some examples of typical tracks in my corpus are “Invincible” by Pop Smoke, a popular Rap/Hip-hop song, and “Counting Stars” by OneRepublic, a popular Pop song with its iconic vocal harmonies and theatrical elements. Atypical tracks in my corpus include “Sweater Weather” by The Neighbourhood, which is a mix between indie rock and alternative rock, and “Break from Toronto” by PARTYNEXTDOOR, which could be classified as R&B or Hip-hop but also contains a form of rap. These atypical songs may show up as outliers in the data, and such outliers may have to be filtered out first to make the comparison between genres as fair as possible.
In conclusion, this portfolio will focus on exploring and analyzing the musical characteristics of Pop, Rap, and Phonk, to find out what makes them appealing to listen to under various circumstances.
Explanation
This bar chart gives a good representation of the differences in key usage between the three genres. I was wondering whether each genre would have its own key pattern, and thus whether a clearly different key distribution would be visible for each of the three genres. For Pop I expected the C key to be used most, followed by the G and F keys, since I remembered reading that these typically have the highest usage rates. So it was no surprise that this was actually the case. However, I was surprised to see how balanced the key distribution of the Pop genre was, since the other genres are a lot less balanced. Furthermore, I expected the Rap genre to have the highest usage of the E key, followed by the A and G keys. To my surprise the most used key was actually C#, and the other keys did not even come close to its usage rate. Since Phonk has its roots in Trap and Hip-hop, I expected to see a lot of use of the F or G keys. For the F key this was absolutely false. For the G key it was relatively true, since this key is used a lot, although it was definitely not the most used key, which was again C#.
So something interesting to look at is the most used key per genre: both Phonk and Rap use the C# key the most. I did expect similar key usage in Phonk and Rap, since the style of beat seems similar. However, in Rap songs this key is used almost twice as much as any other key, while in Phonk it is relatively balanced with the other keys. It is also interesting to see that Phonk does not have a single instance of the D# key. Finally, we can clearly see that out of the three genres, Pop has the most balanced distribution of key usage, since it uses all keys and does not really have one outstanding key.
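The counting behind such a key histogram can be sketched in a few lines. This is only an illustration and not the code used for the plot: the key values below are made up, and the sketch assumes keys are given in the usual pitch-class integer notation (0 = C, 1 = C#, and so on).

```python
from collections import Counter

# Pitch-class names in integer notation (0 = C, 1 = C#, ...).
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def key_shares(keys):
    """Return the fraction of tracks in each of the 12 keys."""
    counts = Counter(keys)  # missing keys count as 0
    return {name: counts[i] / len(keys) for i, name in enumerate(PITCH_CLASSES)}


# Made-up key values for a tiny hypothetical playlist, not data from the corpus.
example_keys = [1, 1, 1, 7, 0, 1, 9, 7]
shares = key_shares(example_keys)

# The most used key and the keys that never occur.
most_used = max(shares, key=shares.get)
unused = [name for name, share in shares.items() if share == 0]
print(most_used, shares[most_used], unused)
```

Comparing shares rather than raw counts keeps playlists of different sizes comparable.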
Explanation
I thought it would be interesting to have a look at the different tempi for each of the three genres. I expected to see a lot of similarities between them, since most songs I like and listen to seem to have a tempo within a certain range. I do not like extremely fast paced songs because it is hard to follow the actual musical characteristics or to sing or dance along. However, I also don’t like extremely slow music, since I tend to get bored easily by slow songs.
Now for the actual differences in the tempi of the genres. Interestingly, while both Phonk and Pop seem to prefer a tempo between 120 and 130 beats per minute (BPM), the plot shows that Rap tends to be a little faster on average: around 140 BPM. However, the plot also shows a slight peak at 140 BPM for both Phonk and Pop. Since the Rap playlist contains the fewest songs of the three genres, it seems that overall most songs I like are still around 120-140 BPM. Another interesting thing to see is that, aside from the peak around 140 BPM, all three genres also have a small peak at around 95 BPM, which is a lot slower than the usual tempo of these genres. This could mean two things. The first is that it does not in fact matter whether the tempo of a song is unusual for its genre: a song can still thrive in a genre regardless of its tempo as long as it is composed well. The second is that these peaks are caused by subgenres within each of the three main genres, which could mean that some subgenres overlap between the main genres. It would be interesting to look deeper into the subgenres within each main genre and see whether such an overlap has caused this behavior.
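To make the idea of "peaks" in a tempo histogram concrete, here is a minimal sketch. The tempo values and the 10 BPM bin width are assumptions chosen for illustration and do not come from the playlists.

```python
from collections import Counter


def tempo_histogram(tempi, bin_width=10):
    """Count tracks per tempo bin; each bin is labelled by its lower edge in BPM."""
    bins = Counter(int(t // bin_width) * bin_width for t in tempi)
    return dict(sorted(bins.items()))


def local_peaks(hist, bin_width=10):
    """Return bins that hold more tracks than both neighbouring bins."""
    peaks = []
    for start, count in hist.items():
        left = hist.get(start - bin_width, 0)
        right = hist.get(start + bin_width, 0)
        if count > left and count > right:
            peaks.append(start)
    return peaks


# Made-up tempi (BPM) for a hypothetical playlist.
example_tempi = [94.0, 96.5, 98.0, 104.0, 118.0, 122.0, 124.5,
                 127.0, 128.0, 133.0, 140.0, 141.5, 143.0, 155.0]
hist = tempo_histogram(example_tempi)
peaks = local_peaks(hist)
print(hist)
print(peaks)
```

With real data the choice of bin width matters: bins that are too narrow produce many spurious peaks, while bins that are too wide can merge the 120-130 and 140 BPM groups.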
Explanation
For this plot I decided to compare the valence and energy of the three genres: Pop, Rap and Phonk. Valence measures the positivity of a song’s musical content, while energy measures the intensity and activity of a song. These are two important factors that contribute to a song’s overall mood and emotional impact. I wanted to see whether there was any correlation between the valence and the energy of songs. I also thought it would be interesting to see whether the songs of these genres would differ massively in overall mood and emotional impact.
Interestingly, the distribution immediately shows that, as with the distribution of keys, Pop songs are the most balanced and evenly distributed, while Phonk songs all lie towards the top of the energy scale and Rap songs lie more around the middle. I expected the Pop genre to have the highest average valence of the three genres, since these songs are often linked to positive vibes, and Phonk to have the highest energy on average, since this genre is associated with fast, loud and noisy music (as can be seen in the plot, where bigger points indicate louder songs). It was therefore somewhat surprising to see that Pop actually has a really balanced valence distribution. What surprised me even more was that this is the case for all three genres: they are all fairly evenly spread along the valence scale. Although it was no surprise that Phonk has a relatively high energy on average, I was surprised to find that Phonk also scores relatively high on valence, since I assumed this fast and loud type of music would give off an angry vibe. I also expected Rap to be quite high on the energy scale, since there are a lot of fast paced, heavy beat rap songs out there, but this was not exactly the case. At least compared to the other two genres, Rap has a slightly lower energy score on average. However, Rap songs do seem to show a much stronger correlation between energy and valence, since this is the only panel in which a clear linear trend between the two measurements can be seen. This suggests that there can be a correlation between the valence and energy of songs, but that it depends heavily on the genre.
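The "clear linear trend" can be quantified with a Pearson correlation coefficient per genre. The sketch below is one way to do that under stated assumptions: the valence and energy values are invented for illustration, and the function is a plain textbook implementation rather than the method used for the plot.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Invented (valence, energy) values, only to show the contrast.
valence = [0.2, 0.4, 0.5, 0.7, 0.9]
energy_trend = [0.45, 0.58, 0.60, 0.68, 0.80]    # energy rises with valence
energy_scatter = [0.80, 0.40, 0.90, 0.50, 0.70]  # no clear relation

r_trend = pearson(valence, energy_trend)
r_scatter = pearson(valence, energy_scatter)
print(round(r_trend, 2), round(r_scatter, 2))
```

A value close to 1 corresponds to the kind of linear trend described for Rap, while a value near 0 corresponds to a cloud of points without a clear direction.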
Explanation
This boxplot shows that interestingly enough, rap seems to have the highest danceability value on average according to Spotify’s measure. Danceability is a measure of how suitable a track is for dancing, based on factors such as rhythm stability, beat strength, and overall tempo.
I would personally have guessed that Pop songs would have the highest danceability by a mile, since Pop music is often associated with upbeat, danceable tracks that are popular in clubs and on dance floors. So seeing that Pop only has a median danceability of 0.69, while Phonk has a median of 0.70 and Rap a median as high as 0.79, was really surprising to me. I expected that Rap and Phonk would prioritize other musical characteristics over danceability, such as lyricism, groove, or atmospheric qualities. I also thought danceability could be influenced by energy, which is why I expected Phonk to do quite well. However, as can be seen in the graph, Rap wins the danceability battle by quite a big margin. I think it is really interesting to see Rap score so high, especially considering that Rap had lower energy levels than the other two genres.
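For reference, the comparison of medians boils down to the following small computation. The danceability values are placeholders chosen so that their medians match the numbers quoted above; they are not the actual tracks.

```python
from statistics import median

# Placeholder danceability values per genre (not the real tracks).
danceability = {
    "Pop": [0.55, 0.69, 0.80],
    "Phonk": [0.62, 0.70, 0.85],
    "Rap": [0.70, 0.79, 0.90],
}

# Median per genre, then genres ordered from most to least danceable.
medians = {genre: median(values) for genre, values in danceability.items()}
ranking = sorted(medians, key=medians.get, reverse=True)
print(medians)
print(ranking)
```

The median is used here because, unlike the mean, it is not pulled around by a few unusually danceable or undanceable tracks.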
Explanation
Here you can see the mean tempo of each song, how much the tempo deviates from that mean within the song, the loudness (shown as the size of the data point), and the BPM again as the color of the point for extra clarity. The slider can be moved to display all songs of the selected genre, or the play button can be pressed to show the genres one after another like a slideshow.
This graph is useful for understanding why the histogram of tempi is distributed the way it is, and for identifying the outliers that deviate from the standard tempo range of a genre. It is also intriguing to examine songs whose tempo varies within the piece and to see how large the standard deviation of their tempo is.
It is interesting to see that Phonk can almost be divided into three tempo ranges: around 100 BPM, around 120 BPM and around 140 BPM. This explains the histogram of tempi for this genre well. The song help urself is the outlier with the slowest average tempo, around 82 BPM, and the song METAMORPHOSIS is the outlier with the fastest average tempo, around 175 BPM.
It is interesting to see that Pop seems to have the most stable BPM range: more than 50% of the songs lie within the 120-140 BPM range and the rest spread out evenly on both sides. It also really only has one major outlier: the song Fireflies has an extremely fast tempo of around 180 BPM.
Something fascinating is that the songs in Pop and Phonk do not deviate much from their main tempo within the piece, while there are some extreme deviations within Rap songs. One example is the song family ties, which has a tempo standard deviation of around 9.5. Rap songs are also all over the place when it comes to mean tempo, so there are not really any major outliers in that genre.
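One way to obtain a mean tempo and a tempo standard deviation for a single song is to convert the duration of each beat into a local tempo. The sketch below assumes this approach and uses invented beat durations; the values in the plot may have been computed differently (for example from section-level tempi).

```python
from statistics import mean, pstdev


def tempo_stats(beat_durations):
    """Mean and standard deviation of the local tempo (BPM) from beat durations in seconds."""
    local_tempi = [60.0 / duration for duration in beat_durations]
    return mean(local_tempi), pstdev(local_tempi)


# Invented examples: a perfectly steady song and one that speeds up halfway.
steady = [0.5] * 16                   # 120 BPM throughout
speeding_up = [0.5] * 8 + [0.4] * 8   # 120 BPM, then 150 BPM

steady_mean, steady_sd = tempo_stats(steady)
drift_mean, drift_sd = tempo_stats(speeding_up)
print(steady_mean, steady_sd)
print(drift_mean, drift_sd)
```

A song with a constant beat gets a standard deviation of zero, while a song that changes tempo partway through stands out with a large value.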
Explanation
Timbre is a term used to describe the unique tonal quality or “color” of a sound, and timbral analysis can provide insights into the spectral and temporal characteristics of different genres of music. The 12 timbre coefficients provided by the Spotify API are closely related to Mel-frequency cepstral coefficients (MFCCs), which are commonly used in speech and music recognition to capture the spectral features of audio signals. These coefficients can help distinguish between different types of sounds based on their harmonic content, brightness, and other factors.
Here is the explanation of each of the 12 coefficients:
“c01”: This coefficient represents the overall brightness or darkness of the audio segment, based on the spectral energy distribution.
“c02”: This coefficient represents the spectral centroid, which is a measure of the center of gravity of the spectral energy distribution. It is related to the perceived “brightness” or “darkness” of the sound.
“c03”: This coefficient represents the spectral spread, which is a measure of the width of the spectral energy distribution. It is related to the perceived “sharpness” or “dullness” of the sound.
“c04”: This coefficient represents the spectral skewness, which is a measure of the asymmetry of the spectral energy distribution. It is related to the perceived “smoothness” or “roughness” of the sound.
“c05”: This coefficient represents the spectral kurtosis, which is a measure of the “peakedness” of the spectral energy distribution. It is related to the perceived “clarity” or “muddiness” of the sound.
“c06”: This coefficient represents the spectral flatness, which is a measure of the relative balance of energy across the frequency spectrum. It is related to the perceived “tonality” of the sound.
“c07”, “c08”, “c09”, “c10”, “c11” and “c12”: All these coefficients capture the overall spectral shape of the sound within each frequency range.
Something interesting to see is that the first coefficient is not very diverse, since it does not deviate much from its mean value. This could be because the first coefficient, “c01”, which represents the overall brightness or darkness of the audio segment, captures a more general aspect of the audio signal. It represents the average energy distribution across the entire frequency spectrum, rather than focusing on specific frequency ranges or spectral characteristics. The opposite is true for the rest of the first half of the coefficients, which deviate a lot, while the second half becomes more and more dense and deviates less.
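To give a feeling for what cepstral coefficients are, the sketch below computes them in the textbook way: take the logarithm of the energy in a set of frequency bands and apply a discrete cosine transform (DCT-II). This is a simplified illustration with invented band energies. It is not Spotify’s procedure, which is not assumed to work exactly like this.

```python
import math


def dct_ii(values):
    """Unnormalised DCT-II of a sequence."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi / n * (i + 0.5) * k) for i, v in enumerate(values))
        for k in range(n)
    ]


def cepstral_coefficients(band_energies, n_coeffs=12):
    """Cepstral coefficients from (positive) band energies: DCT of the log energies."""
    log_energies = [math.log(e) for e in band_energies]
    return dct_ii(log_energies)[:n_coeffs]


# Invented band energies, ordered from low to high frequency.
flat = [2.0] * 12                              # same energy everywhere
dark = [2.0 ** (12 - i) for i in range(12)]    # energy concentrated in low bands
bright = list(reversed(dark))                  # energy concentrated in high bands

flat_coeffs = cepstral_coefficients(flat)
dark_coeffs = cepstral_coefficients(dark)
bright_coeffs = cepstral_coefficients(bright)
print([round(c, 2) for c in dark_coeffs[:3]])
```

In this sketch the first coefficient only reflects the overall level, while the second one changes sign depending on whether the energy sits in the low or the high bands, which is why low-order coefficients are often read as broad descriptions of the spectral shape.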
Explanation
A chromagram is a visual representation of the distribution of pitches in a musical recording. The x-axis represents the time in seconds and the y-axis represents the pitch.
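As a rough illustration of how one column of a chromagram comes about, the sketch below folds frequencies into the 12 pitch classes, ignoring the octave, and scales the result so that the strongest pitch class equals 1. The frequencies and magnitudes are invented, and the rounding to equal-tempered pitches with A4 = 440 Hz is an assumption of this sketch.

```python
import math

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pitch_class(frequency_hz):
    """Index (0-11) of the nearest equal-tempered pitch class, with A4 = 440 Hz."""
    midi_note = round(69 + 12 * math.log2(frequency_hz / 440.0))
    return midi_note % 12


def chroma_vector(partials):
    """Fold (frequency, magnitude) pairs into 12 pitch classes, scaled to a maximum of 1."""
    chroma = [0.0] * 12
    for frequency_hz, magnitude in partials:
        chroma[pitch_class(frequency_hz)] += magnitude
    peak = max(chroma)
    return [value / peak for value in chroma] if peak > 0 else chroma


# Invented partials for one short segment: C#4, C#5, C4 and E4.
segment = [(277.18, 1.0), (554.37, 0.5), (261.63, 0.6), (329.63, 0.2)]
chroma = chroma_vector(segment)
strongest = PITCH_CLASSES[chroma.index(max(chroma))]
print(strongest, [round(value, 2) for value in chroma])
```

Stacking such vectors over time, one per segment, gives the picture shown in a chromagram.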
The first chromagram shows that for the Pop song Counting Stars by OneRepublic, the C# pitch is used for the longest time by far, with C in second place. The rest of the pitches get less playtime in comparison. We can clearly see that after around 21 seconds (indicated by the first red line) some new pitches are introduced. This makes sense, since this is the end of the intro of the song, after which the instruments slowly start playing one after another. At around 78 seconds into the song (indicated by the second red line) the same thing happens: a section ends and all instruments start playing again. The last time this happens is at around 206 seconds, and you can clearly hear the point where the music introduces all the instruments again.
The second chromagram shows that for the Phonk song Close Eyes by DVRST, the C pitch is used for the longest time by far, followed by the C#, F and E pitches. The rest of the pitches do not get much playtime in comparison. At around the halfway point of the song, we can hear a short pause from almost all heavy instruments before the song continues. This is indicated by the red line at around 70 seconds.
The third chromagram shows that for the Rap song Invincible by Pop Smoke, the F# pitch is used for the longest time. We can tell exactly where the beat drops and the bass is introduced, which is at around 23 seconds, again indicated by the red line. We can clearly see the introduction of the high intensity F# pitch at that point.
Explanation
Cepstrograms, also known as cepstral spectrograms, are a type of spectrogram where the x-axis represents the time in seconds and the y-axis represents the cepstral coefficients instead of the frequency. The cepstrograms show how the timbre changes over time in a typical song from each genre: Pop, Phonk and Rap.
Comparing the cepstrograms of three songs, Payphone from the Pop genre, Sahara from the Phonk genre and Yonkers from the Rap genre, reveals interesting differences in their timbral distribution. We can immediately tell that all three songs have a clearer timbral structure towards the lower end of the coefficients, but there are some small differences between the songs.
In “Payphone”, for example, there is a relatively strong concentration of timbral characteristics around the second coefficient after roughly the first 50 seconds, indicating a strong “brightness” or “darkness” of the sound in the areas where bright yellow accents can be seen.
The opposite is true for “Sahara”: there is a strong concentration of timbral characteristics around the second coefficient only during roughly the first 25 seconds, again visible as bright yellow accents.
“Yonkers”, on the other hand, shows more variation in its timbral distribution, with a wider range of timbral characteristics. Only in the last 50 seconds do we see a strong concentration around the second timbre coefficient. This could suggest that this song has a more complex and varied sound texture. These differences in timbral distribution could contribute to the overall appeal of the songs and could be explored further in future analyses.
Explanation
A self-similarity matrix is a visual representation of the similarity between different sections of a musical recording. The two self-similarity matrices for each song, each summarized at the bar level but with axes in seconds, illustrate pitch- and timbre-based self-similarity within the songs: Trumpets by Jason Derulo, METAMORPHOSIS by INTERWORLD and Still D.R.E. by Dr. Dre and Snoop Dogg. These songs were chosen because I think they show the timbre- and pitch-based structure of each genre well.
First of all we can see that the songs differ in length, since the matrices are not the same size. We can identify repeated or similar musical motifs by looking for diagonal patterns in the self-similarity matrix (SSM). The degree of similarity between the motifs can be estimated from the length of the diagonal patterns, as well as their intensity or color. Both “Trumpets” and “METAMORPHOSIS” seem to have a checkerboard pattern. Such a pattern typically appears when a piece is composed of contrasting sections with little relation to each other, while the features within each section stay roughly constant over a stretch of time. It is also interesting to see that while their chroma similarity matrices look alike, their timbre similarity matrices look relatively different.
“Still D.R.E.” seems to have only repeating horizontal and vertical stripes without a clear pattern. This happens when the musical piece has a repetitive structure, with repeated motifs or sections that are similar to each other.
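A self-similarity matrix can be sketched as follows: every bar is summarised by a feature vector, and every pair of bars is compared. The sketch uses cosine similarity and invented three-dimensional feature vectors; the distance measure and features behind the plots may differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors (1 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def self_similarity_matrix(frames):
    """Compare every frame with every other frame."""
    return [[cosine_similarity(a, b) for b in frames] for a in frames]


# Invented features for a song with the form verse, chorus, verse (two bars each).
verse = [1.0, 0.0, 0.2]
chorus = [0.1, 1.0, 0.0]
frames = [verse, verse, chorus, chorus, verse, verse]

ssm = self_similarity_matrix(frames)
for row in ssm:
    print(" ".join(f"{value:.2f}" for value in row))
```

In this toy example the bars within a section are identical, so the printed matrix consists of uniform blocks: high values where a section is compared with itself or its repeat, low values where verse meets chorus.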
Explanation
This is a keygram of SAD!, one of the last songs XXXTENTACION released before he died on 18 June 2018. It is interesting because this song is in my Pop playlist, while it is actually classified as more of a Hip-hop/Trap/R&B song. I wanted to see whether the keygram shows abnormalities compared to a normal Pop song.
However, as can be seen, the key estimates are really blurry, which means that the algorithm has a hard time finding a stable estimate of the actual key. The sparser texture of the song seriously throws off the key-finding algorithm. I wonder what makes this song especially hard for the algorithm, since compared to other songs it has one of the least clear key estimates. I think this makes the musical characteristics of the song more interesting to analyze further in the future. So in conclusion, the difference between this song and songs actually classified as Pop is that Pop songs tend to have a somewhat clearer structure, while SAD! is especially unclear in its keygram.
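Key finding is commonly done by template matching: a chroma vector is correlated with a key profile rotated to each of the 12 possible tonics, and the best match wins. The sketch below assumes this approach with the Krumhansl-Kessler major and minor profiles and invented chroma vectors; it is not necessarily the exact configuration behind the keygram above.

```python
import math

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
# Krumhansl-Kessler key profiles, starting on the tonic.
MAJOR = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
MINOR = [6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17]


def pearson(xs, ys):
    """Pearson correlation coefficient."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def rotate(profile, tonic):
    """Shift a profile so that its first value lands on pitch class `tonic`."""
    return [profile[(j - tonic) % 12] for j in range(12)]


def best_key_and_margin(chroma):
    """Best matching key and how far it is ahead of the runner-up."""
    scores = {}
    for tonic, name in enumerate(PITCH_CLASSES):
        scores[f"{name} major"] = pearson(chroma, rotate(MAJOR, tonic))
        scores[f"{name} minor"] = pearson(chroma, rotate(MINOR, tonic))
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[0][0], ranked[0][1] - ranked[1][1]


# Invented chroma vectors: a full G major texture versus only the notes C and G.
full_texture = rotate(MAJOR, 7)
sparse_texture = [1.0, 0, 0, 0, 0, 0, 0, 1.0, 0, 0, 0, 0]

clear_key, clear_margin = best_key_and_margin(full_texture)
sparse_key, sparse_margin = best_key_and_margin(sparse_texture)
print(clear_key, round(clear_margin, 2))
print(sparse_key, round(sparse_margin, 2))
```

The margin between the best and second-best key is a simple stand-in for how "blurry" an estimate is: with only a couple of pitch classes sounding, several keys fit almost equally well.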
Explanation
The Phonk playlist contains the song MOZART PHONK, whose chordogram can be seen in the top plot. I thought it would be fun to compare the chordogram of that song with the original piece by Mozart, Piano Sonata No. 11, which can be seen in the plot below. The x-axis represents the time in seconds, whereas the y-axis represents the chords played at those times.
We can see that some of the chord structure of the original piece is kept, even though a number of chords might have been altered or completely changed for extra energetic effect. MOZART PHONK seems to have a higher coverage of bright yellow squares, indicating that there are a lot of chords with extremely high energy, while the original piece seems to have a little more balance between high energy and low energy chords. However, when zooming in on the original piece so that its time axis ranges from 0 to 70 seconds, we can see that the two are really quite similar in their chord profile. So although I believe it is slightly difficult to compare the two, since the original piece is a lot longer, we can still clearly see a lot of resemblance to the original in the Phonk version.
Explanation
The Fourier-based tempogram at the left is an attempt to use Spotify’s API to analyse the tempo of Break from Toronto by PARTYNEXTDOOR. Overall, Spotify estimates that the tempo of this track is 117 BPM, and looking in more detail at the tempogram, we can see that this makes sense. The bright yellow line does indeed lie around 117 BPM. However, there are also some indications of higher and lower tempi at the beginning. These different sections of the piece lead to different strengths of so-called tempo octaves, and this freer-form section makes tempo estimation a little harder.
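The idea behind a Fourier-based tempo analysis can be sketched with an idealised click track: for each candidate tempo, check how well a sinusoid of that frequency lines up with the onsets. The onset times below are synthetic (a perfectly steady 117 BPM pulse), so this only illustrates the principle and is not an analysis of the actual track.

```python
import cmath


def tempo_strength(onset_times, bpm):
    """Magnitude of the Fourier coefficient of the onsets at the given tempo."""
    freq_hz = bpm / 60.0
    return abs(sum(cmath.exp(-2j * cmath.pi * freq_hz * t) for t in onset_times))


# Synthetic onsets: 40 clicks at a steady 117 BPM.
beat_period = 60.0 / 117.0
onsets = [k * beat_period for k in range(40)]

# Scan whole-number tempi and pick the strongest one.
strengths = {bpm: tempo_strength(onsets, bpm) for bpm in range(60, 200)}
best = max(strengths, key=strengths.get)

# The double tempo lines up with every click as well.
double_tempo = tempo_strength(onsets, 234)
print(best, round(strengths[best], 1), round(double_tempo, 1))
```

In this idealised case the double tempo of 234 BPM responds just as strongly as 117 BPM, which is one way the tempo-octave ambiguity shows up in a Fourier tempogram.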
This song could be classified as one of the outliers in my Pop playlist, since its genre is actually more R&B/Soul. However, the song still falls only slightly below the typical tempo of a Pop song, namely 120-130 BPM. This can be seen in the histogram of tempi on the visual analysis page.
Wrap up of my corpus analysis
To conclude the analysis of my corpus, I would like to highlight a few key takeaways that I gained from creating this portfolio. I expected a lot of differences between the musical characteristics of the three genres: Pop, Phonk and Rap. As I mentioned in the introduction of my portfolio, I listen to the songs from these genres under different circumstances. Therefore I would have guessed that these songs would differ in many features, which would make them appealing to listen to during different activities. However, to my surprise I found that the differences were mostly only slight changes in tempo, timbre or pitch. It was really interesting to find out that all these small changes add up to make a song fit into a certain genre instead of all songs fitting into the same one.
First of all, after seeing the histogram of the keys used in each genre and the big differences between them, I expected all other measurements to show the same outcome. This expectation turned out not to be accurate for all comparisons of musical characteristics.
One example of these small differences came from analyzing the tempi of each genre: they all seemed to fall almost perfectly into the same range, with an average of 120-140 BPM. A difference of as little as 5 BPM on average separated completely different genres. But when looking deeper into it, I found that the most noticeable thing was not the difference in average tempo between genres. It was the distribution of songs within each genre and the deviation of tempo within the songs themselves. For example, we saw that the Rap genre had some major outliers that deviated strongly from the mean tempo of the song, while the Pop songs were distributed in a shape that almost resembles a normal distribution. Phonk songs, on the other hand, could almost be divided into three tempo groups.
Furthermore, I expected to see a lot more similarities between Rap and Phonk, since both tend to have a similar type of beat and rhythm: heavy and fast paced. While this was sometimes the case, it was definitely not always true. For example, while the energy of Phonk was really high on average, the energy of Rap was much lower than I expected, especially in comparison to Pop. It was still not low overall, but compared to the other genres I analyzed it was lower than expected.
Something else that stood out was that, according to the Spotify API, the Rap genre includes the most danceable songs on average. This was unexpected too, since Pop is known for its dance songs. Not only was Rap the most danceable genre, Pop was actually the least danceable.
Interestingly, I did find that for most comparisons Pop was overall the most balanced and evenly distributed in its features. The spread of its data points was usually the largest, which leads to a more balanced representation of songs. I also felt that the Phonk genre was usually the least balanced, with the smallest spread in its data points. Rap generally seemed to fall somewhere in the middle of the two.
Moreover, I found it interesting to see how the chroma features reveal different aspects of the structure of songs and how this could affect the way a piece of music is perceived. While not every song had a clear structure in every chroma-based representation, the keygram being one example, it was still interesting to see the differences between songs from each genre.
All in all, I have learnt a lot about my taste in music and the different types of songs I listen to. I have gained a lot of insight into the musical characteristics that play major roles in shaping pieces of music, and I now also know which ones are important in the genres I listen to.



What’s next?
While we have analyzed the playlist data and have found that there are several overlapping features and differences between the different genres, it is important to note that these playlists are constantly updating, so they may not be entirely accurate in the future.
One interesting aspect to consider when analyzing the overlapping songs between genres is the subgenres within each main genre. For example, within the broad category of “Rap” there are many subgenres such as Gangsta rap, Alternative hip hop, and even Trap, which coincidentally also has a lot of overlapping features with Phonk.
Analyzing the subgenres within each main genre could shed light on some of the overlapping songs and outliers in the data. It is possible that some of the overlapping songs belong to subgenres that are closely related, while others may be outliers that do not fit neatly into any one category. By analyzing the subgenres, we may be able to gain a more nuanced understanding of the relationship between different genres and their respective fan bases.
Furthermore, analyzing subgenres could also help to identify emerging trends and new subgenres that are gaining popularity. For example, the rise of subgenres such as “indie pop” and “trap music” in recent years has reshaped the landscape of popular music and could be reflected in the Pop playlist data.
However, it is important to note that analyzing subgenres would require a more granular approach to data collection and analysis, which may not be feasible or practical for all research projects. Nonetheless, it is an interesting avenue for future research and could lead to a deeper understanding of the complex relationships between different genres and their respective subcultures.


